Reducing Dimensionality in Multiple Instance Learning with a Filter Method

Authors

  • Amelia Zafra
  • Mykola Pechenizkiy
  • Sebastián Ventura
Abstract

In this article, we describe a feature selection algorithm that can automatically find relevant features for multiple instance learning. Multiple instance learning is considered an extension of traditional supervised learning in which each example is made up of several instances and no information is available about the labels of particular instances. In this scenario, traditional supervised learning cannot be applied directly, and new techniques must be designed. Our approach is based on the principles of the well-known Relief-F algorithm, which is extended to select features in this new learning paradigm by modifying the distance, the difference function, and the computation of the feature weights. Four variants of this algorithm are proposed and their performance is evaluated in this new learning framework. Experimental results using a representative number of different algorithms show that predictive accuracy improves significantly when a multiple instance learning classifier is learnt on the reduced data set.
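The Relief-F extension outlined above can be sketched in code. The sketch below is illustrative only, not the authors' exact algorithm: it assumes a minimal-Hausdorff distance between bags and a centroid-based feature difference function, both hypothetical choices standing in for the modified distance and difference functions the abstract mentions (the paper proposes several variants).

```python
import numpy as np

def min_hausdorff(bag_a, bag_b):
    """Minimal Hausdorff distance between two bags of instances
    (an illustrative choice of bag-level metric)."""
    d = np.linalg.norm(bag_a[:, None, :] - bag_b[None, :, :], axis=2)
    return d.min()

def relieff_mi(bags, labels, n_features, m=50, k=3, rng=None):
    """ReliefF-style filter adapted to multiple instance learning:
    bags replace single instances, a bag-level distance replaces the
    instance distance, and the feature-wise difference is taken
    between bag centroids. Returns one relevance weight per feature."""
    rng = np.random.default_rng(rng)
    labels = np.asarray(labels)
    w = np.zeros(n_features)
    for _ in range(m):
        i = rng.integers(len(bags))
        # Distance from the sampled bag to every other bag.
        dists = np.array([min_hausdorff(bags[i], b) for b in bags])
        dists[i] = np.inf  # exclude the sampled bag itself
        same = np.where(labels == labels[i])[0]
        diff = np.where(labels != labels[i])[0]
        hits = same[np.argsort(dists[same])[:k]]
        misses = diff[np.argsort(dists[diff])[:k]]
        # Feature-wise difference between bag centroids: decrease a
        # feature's weight for near hits, increase it for near misses.
        c = bags[i].mean(axis=0)
        for h in hits:
            w -= np.abs(c - bags[h].mean(axis=0)) / (m * k)
        for ms in misses:
            w += np.abs(c - bags[ms].mean(axis=0)) / (m * k)
    return w
```

Features whose weight exceeds a threshold (or the top-ranked features) would then form the reduced data set on which the MIL classifier is trained.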


Similar Articles

HyDR-MI: A hybrid algorithm to reduce dimensionality in multiple instance learning

Feature selection techniques have been successfully applied in many applications to make supervised learning more effective and efficient. These techniques have been widely used and studied in traditional supervised learning settings, where each instance is expected to have a label. In multiple instance learning (MIL), each example or bag consists of a variable set of instances, and the label...


ReliefF-MI: An extension of ReliefF to multiple instance learning

In machine learning, the so-called curse of dimensionality, pertinent to many classification algorithms, denotes the drastic increase in computational complexity and classification error with data having a great number of dimensions. In this context, feature selection techniques try to reduce dimensionality by finding a new, more compact representation of instances, selecting the most informative fea...


Developing a Filter-Wrapper Feature Selection Method and its Application in Dimension Reduction of Gene Expression

Nowadays, the growing volume of data and the number of attributes in datasets reduce the accuracy of learning algorithms and increase their computational complexity. Dimensionality reduction can be carried out by feature selection, which is done through filter and wrapper methods. Wrapper methods are more accurate than filter methods, but filter methods run faster and carry a lower computational burden. With ...


Different Learning Levels in Multiple-choice and Essay Tests: Immediate and Delayed Retention

This study investigated the effects of different learning levels, including Remember an Instance (RI), Remember a Generality (RG), and Use a Generality (UG), in multiple-choice and essay tests on immediate and delayed retention. Three hundred pre-intermediate students participated in the study. Reading passages with multiple-choice and essay questions at different levels of learning were giv...


Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution

Feature selection, as a preprocessing step to machine learning, has been effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving comprehensibility. However, the recent increase in the dimensionality of data poses a severe challenge to many existing feature selection methods with respect to efficiency and effectiveness. In this work, we introduce a...
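A fast correlation-based filter of the kind this teaser describes typically ranks features by their symmetrical uncertainty with the class and then discards features that are more correlated with an already-selected feature than with the class. The sketch below is a hedged illustration of that general scheme for discrete features, not the exact algorithm of the cited paper; the `delta` relevance threshold and the redundancy test are standard but assumed choices.

```python
import numpy as np
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a sequence of discrete values."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0.0:
        return 0.0
    hxy = entropy(list(zip(x, y)))
    mutual_info = hx + hy - hxy
    return 2.0 * mutual_info / (hx + hy)

def correlation_based_filter(X, y, delta=0.0):
    """Keep features relevant to the class (SU > delta), then drop any
    feature more correlated with an already-kept feature than with the
    class. Returns the indices of the selected features."""
    su_class = [symmetrical_uncertainty(X[:, j], y) for j in range(X.shape[1])]
    order = sorted((j for j in range(X.shape[1]) if su_class[j] > delta),
                   key=lambda j: -su_class[j])
    selected = []
    for j in order:
        if all(symmetrical_uncertainty(X[:, j], X[:, s]) < su_class[j]
               for s in selected):
            selected.append(j)
    return selected
```

Because each feature is compared only against the short list of already-selected features rather than all pairs, this style of filter stays fast even with many dimensions.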




Publication date: 2010